# PERFORMANCE ANALYSIS OF DIFFERENT LOW LEAKAGE FULLY HALF-SELECT-FREE ROBUST SRAM CELLS

<sup>1</sup>G LAXMI PRIYANKA, <sup>2</sup>CH SWATHI, <sup>3</sup>G RAMESH

<sup>1,2,3</sup>Assistant Professor, ECE Department, St. Martin's Engineering College, Sec

# ABSTRACT

The increasing integration density of microelectronic circuits, in combination with non-constantly scaled supply voltages, results in higher electric fields in MOS transistors. This is a central source of several aging mechanisms, some of which shift the parameters of MOS transistors over their lifetime. These parametric degradation effects can be separated into two groups, 'Bias Temperature Instability' (BTI) and 'Hot Carrier Injection' (HCI). This work focuses on the impact of these degradation mechanisms on 6-Transistor Static Random Access Memory (SRAM) arrays in 65 nm low-power CMOS technology. Monte-Carlo simulation confirms low-voltage operation without any additional peripheral assist circuits. We also present a comparative analysis of the impact of BTI reliability on SRAM performance in a predictive 32 nm high-k metal gate CMOS technology. Under static stress, the Read Static Noise Margin (RSNM) reduces for all cells. However, the 11T-1 and 11T-2 cells improve RSNM by 2.7% and 3.3%, respectively, under relaxed stress of 10/90. Moreover, over a period of $10^8$ seconds (approx. 3 years), the proposed 11T-1 (11T-2) cell improves WM by 7.2% (13.2%), reduces write power by 28.0% (20.4%) and leakage power by 85.7% (86.9%), and degrades write delay by 38.1% (23.3%) without affecting read delay/power. The 11T-1 (11T-2) cell exhibits 4.8% higher (2% lower) area overhead compared to the earlier 11T cell. Hence, the proposed 11T cells are an excellent choice for reliable SRAM design at the nanoscale amidst process variations and transistor aging effects, and can also be used in a bit-interleaving architecture to achieve multi-cell upset (MCU) immunity.

## 1. INTRODUCTION

Static Random Access Memory (SRAM) is nowadays a dominant part of Systems-on-Chip (SoC). Up to about half of the die area and 2/3 of the transistor count of a modern microprocessor consist of SRAM cells. Fig. 1.1 shows the die photo of an Intel Penryn processor manufactured in 45 nm technology [www.intel.com]; the SRAM area can be identified on the left half of the die by its characteristic homogeneous layout style. 6 MB of SRAM cache memory equals approx. 300 million transistors, which is 73% of the complete number of 410 million transistors.
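The transistor count quoted above can be roughly cross-checked with a few lines of arithmetic. The sketch below assumes 1 MB = 2^20 bytes and six transistors per bit, and it ignores tag, redundancy and ECC bits; these assumptions are not stated in the text.

```python
# Rough consistency check of the cache transistor count quoted above.
# Assumptions (not from the source): 1 MB = 2**20 bytes, 6 transistors per bit,
# and no tag, redundancy, or ECC bits are counted.
CACHE_BYTES = 6 * 2**20
BITS_PER_BYTE = 8
TRANSISTORS_PER_CELL = 6
TOTAL_TRANSISTORS = 410e6  # total transistor count quoted in the text

sram_transistors = CACHE_BYTES * BITS_PER_BYTE * TRANSISTORS_PER_CELL
sram_share = sram_transistors / TOTAL_TRANSISTORS

print(f"SRAM transistors: {sram_transistors / 1e6:.0f} million")
print(f"Share of all transistors: {sram_share:.1%}")
```

Under these assumptions the result lands close to the 300 million transistors and the share of roughly three quarters given in the text.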



Fig. 1.1: Intel Penryn Processor: about half of the die area and 2/3 of the transistor count consist of SRAM, identifiable on the left half of the die

Systems on Chip will require more and more memory in the future. 90% of the die area is projected to be memory in the next 10 years [1]. Die area directly translates to cost. To get maximum memory capacity on the smallest possible area, the two obvious main approaches are: 1. minimize transistor sizes, 2. densify transistor packaging. This is why SRAM has the smallest transistors and the highest transistor density of the whole chip. This work focuses on the behavior of SRAM cells made of minimum-size transistors with especially tight design rules with respect to parametric reliability issues.

Recently, multi-bit soft error/upset (MCU) has threatened the stability of SRAMs at ultra-scaled technology nodes due to the reduction in effective distance between transistors. The bit-interleaving (BI) architectural technique is an efficient way to deal with this error. However, this technique is applicable only to cells that exhibit fully half-select (HS) free operation. The straightforward approach to achieve HS-free operation is to use cross-point cell selection, where the write path consists of two access transistors controlled by different row-based and column-based signals. However, stacked transistors in the write-access path severely degrade the write-ability, which makes it necessary to use WL boosting for both the row-based and column-based write WL at the expense of dynamic power. Two BI cells, 11T and 12T, were proposed that eliminate HS disturb, again by using cross-point selected series-connected access transistors. Although these cells improve the write-ability by using a power cut-off write-assist and do not require wordline boosting, they suffer from degradation of the floating-1 level of the data storing nodes Q or QB in column write HS cells. They require an extra pulse-width controller in the column circuitry to achieve a very precise pulse width for the word-lines during the write operation in order to retain the data in the column write half-select (CHS) cells. Recently, a BI power-gated 9T cell has been proposed to solve the HS issue; however, the power cut-off used during the write operation again leads to floating of data at the storing node Q in row half-select (RHS) cells. Therefore, in this work, we propose two new 11T cells that mitigate the HS issue without using write-back or any other assist techniques and support a BI architecture to improve MCU immunity. The first proposed cell (termed 11T-1) uses the supply-cut-off and write '0' only technique, whereas the second proposed cell (termed 11T-2) uses the ground-cut-off and write '1' only technique for write-ability enhancement. The power cut-off in the proposed cells does not lead to floating of data storage nodes in any of the HS cells, contrary to the existing 11T cell.

Scaling below the 32 nm node leads to reliability concerns characterized by progressive degradation of devices due to aging. Bias temperature instability (BTI) is one of the major reliability issues encountered by devices due to aggressive scaling. Negative bias temperature instability (NBTI), observed primarily in PMOS devices, has been the biggest reliability concern over the years, but the introduction of the high-k metal gate, with its dependence on charge trapping, makes positive bias temperature instability (PBTI) a major reliability issue in NMOS devices. NBTI and PBTI increase the threshold voltage of the transistor with stress time and consequently degrade the performance of the circuit. It is, therefore, crucial to analyze the impact of NBTI and PBTI on different SRAM performance metrics.
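To illustrate how a threshold-voltage shift can grow with stress time, the sketch below uses a power law of the form ΔVth = A·t^n, an empirical form often used to describe BTI. The text does not state which model was used, so the functional form, the prefactor `a_mv`, and the exponent `n_exp` are assumptions chosen only for illustration.

```python
# Illustrative power-law BTI model: delta_Vth = A * t**n.
# The default a_mv and n_exp are hypothetical placeholders, not values
# taken from this work.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def delta_vth_mv(stress_time_s, a_mv=3.0, n_exp=1.0 / 6.0):
    """Threshold-voltage shift in mV after stress_time_s seconds of stress."""
    if stress_time_s < 0:
        raise ValueError("stress time must be non-negative")
    return a_mv * stress_time_s ** n_exp


def stressed_vth_mv(vth0_mv, stress_time_s, **kwargs):
    """Magnitude of the aged threshold voltage: fresh value plus the shift."""
    return vth0_mv + delta_vth_mv(stress_time_s, **kwargs)


if __name__ == "__main__":
    for t in (1e2, 1e4, 1e6, 1e8):
        years = t / SECONDS_PER_YEAR
        print(f"t = {t:.0e} s ({years:.2f} years): shift = {delta_vth_mv(t):.1f} mV")
```

The last loop iteration corresponds to a stress time of 10^8 seconds, which is a little over three years.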

## 2. LITERATURE REVIEW

This article presents the simulation of 6T, 9T, LP10T, ST10T and WRE8T SRAM cells. All simulations have been carried out in a 90 nm technology using the Microwind EDA tool.

## 2.1 6T SRAM cell

Kim TH, Liu J, Keane J and Kim CH (2008) describe the 6T cell. Fig. 2.1 shows the circuit diagram of a conventional SRAM cell [8]. Before the read operation begins, the bit line (BL) and the bitbar line (BLB) are precharged to the supply voltage Vdd.

When the word line (WL) is selected, the access transistors are turned on. This causes a current to flow from the supply voltage (Vdd) through the pull-up transistor TP1 of the node storing "1". On the other side, current flows from the precharged bitbar line to ground, thus discharging the bitbar line. As a result, a differential voltage develops between BL and BLB. This small potential difference between the bit lines is sensed and amplified by the sense amplifiers at the data output.
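The read sequence described above can be mimicked with a very small behavioural model: both bit lines start at Vdd, the bit line on the side storing '0' is discharged by a constant cell current, and the read completes once the differential exceeds a sense margin. This is only a sketch; the function name `read_6t` and every numeric default (cell current, bit-line capacitance, sense margin, time step) are hypothetical and not taken from the simulations in this work.

```python
# Behavioural sketch of a 6T read: both bit lines are precharged to Vdd and
# the one attached to the node storing '0' is discharged by the cell.
# All numeric defaults are hypothetical illustration values.
def read_6t(q, vdd=1.0, cell_current_a=20e-6, bitline_cap_f=100e-15,
            sense_margin_v=0.1, time_step_s=10e-12, max_steps=1000):
    """Return (sensed_bit, bl_voltage, blb_voltage, elapsed_seconds)."""
    if q not in (0, 1):
        raise ValueError("q must be 0 or 1")
    bl = blb = vdd  # precharge phase
    # Voltage drop per time step from dV = I * dt / C.
    dv = cell_current_a * time_step_s / bitline_cap_f
    for step in range(1, max_steps + 1):
        if q == 1:
            blb = max(0.0, blb - dv)  # QB = 0 pulls BLB down
        else:
            bl = max(0.0, bl - dv)    # Q = 0 pulls BL down
        if abs(bl - blb) >= sense_margin_v:
            sensed = 1 if bl > blb else 0
            return sensed, bl, blb, step * time_step_s
    raise RuntimeError("differential never reached the sense margin")
```

Calling `read_6t(1)` leaves BL at Vdd and lowers BLB until the assumed sense margin is reached, which mirrors the case described in the text where the stored "1" keeps BL high.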



Fig. 2.1: Conventional 6T SRAM cell

## 2.2 9T SRAM cell

Liu Z and Kursun V (2008) introduce the 9T SRAM cell [9] shown in Fig. 2.2. Write occurs just as in the 6T SRAM cell. Reading occurs separately through N5, N6 and N7, controlled by the read signal (RWL) going high. This design suffers from high bit line capacitance because of the additional pass transistors on the bit line.



Fig. 2.2: 9T SRAM Cell

## 2.3 Fully Differential Low Power 10T SRAM

Singh S, Arora N, Gupta N and Suthar M (2012) propose the fully differential low power 10T SRAM bit cell [10] shown in Fig. 2.3. The design strategy of the cell is the series connection of a tail transistor. The gate electrode of this device is controlled by the output of an XOR gate, the inputs of which are tapped from the write word line (WWL) and read word line (RWL) control signals coming from the WWL and RWL drivers. The XOR gate and the tail transistor are shared by all the cells in a row. The tail transistor has to be appropriately upsized to sink the currents from all the cells in the row. Without this read buffer, a cell with such small drivers and a series-connected tail transistor would exhibit an unacceptably low read static noise margin (RSNM), resulting in read instability.



Fig. 2.3: Fully Differential Low Power 10T SRAM (LP10T)

## 3. SRAM FUNDAMENTALS

In the computer memory hierarchy, the fast, small-capacity memory types are at the top, while the slow, huge-capacity memories are at the bottom (Fig. 3.1). While disc drives store Terabytes of data and have access times in the 10 ms region, main memory of DRAM type typically stores some Gigabytes but has an access time of approx. 10-100 ns, which is faster by a factor of $10^5$-$10^6$. Most of these faster techniques are based on charging or discharging of capacitors, which takes some time for the transport of charge. Often they represent dynamic memories, which must be refreshed in fractions of a second to enable long storage time [6].

Fig. 3.1: Memory hierarchy of a Personal Computer (PC)

To further improve access time by a factor of 10 or more in order to get to the top of the memory hierarchy pyramid, the principle of positive feedback is used. No charge needs to be stored; positive feedback is a technique that drives a circuit to its extreme values and therefore realizes bistable systems. In the case of memory those are the two binary states '1' and '0'; systems using this technique are called 'Flip-Flops'. SRAM, latches and registers are based on that principle: their key feature is extremely fast read and write access. Latches, which are level-sensitive and typically used to build sequential logic circuits, are often based on cross-coupled NAND or NOR gates, compare Fig. 3.2(a). The advantage of a latch is its simple usage and asynchronous data interface: it can be easily read and written with 3 signals R, S and Q, compare Fig. 3.2(b). Q always keeps the stored information, and setting R or S to 1 while keeping the other signal at 0 resets or sets the latch. To build edge-triggered registers, the level-sensitive latch must be transformed into a synchronous circuit, which adds some more transistors. So the disadvantage is the large area consumption: typically 10 to 30 transistors are required to store only one binary digit (bit).



(a) SR Flip-Flop is a NOR-based latch with the typical cross-coupling for positive feedback.

| S | R | Action    |
|---|---|-----------|
| 0 | 0 | No change |
| 0 | 1 | Q=0       |
| 1 | 0 | Q=1       |
| 1 | 1 | forbidden |

(b) Table for reading and writing the latch with 3 signals.

Fig. 3.2: Simplest latch: asynchronous SR Flip-Flop which can be used to build sequential logic circuits [6].
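The positive-feedback behaviour of the NOR-based latch can be reproduced by evaluating the two cross-coupled gates repeatedly until their outputs stop changing. The sketch below uses the usual convention that S = 1 sets Q to 1 and R = 1 resets Q to 0; the function names `nor` and `sr_latch` are illustrative and not part of the original text.

```python
def nor(a, b):
    """Two-input NOR on 0/1 values."""
    return 0 if (a or b) else 1


def sr_latch(s, r, q, max_iter=10):
    """Settle a NOR-based SR latch and return the new value of Q.

    q is the currently stored bit. S = 1 sets Q to 1, R = 1 resets Q to 0,
    and S = R = 0 keeps the stored value.
    """
    if s == 1 and r == 1:
        raise ValueError("S = R = 1 is the forbidden input combination")
    qb = 1 - q
    for _ in range(max_iter):
        new_q = nor(r, qb)      # the gate driving Q sees R and QB
        new_qb = nor(s, new_q)  # the gate driving QB sees S and Q
        if (new_q, new_qb) == (q, qb):
            return q            # outputs are stable: feedback has settled
        q, qb = new_q, new_qb
    raise RuntimeError("latch did not settle")
```

For example, `sr_latch(1, 0, 0)` settles to 1 (set), `sr_latch(0, 1, 1)` settles to 0 (reset), and `sr_latch(0, 0, q)` returns `q` unchanged.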

SRAM, on the other hand, needs a complex periphery to read or write a distinct cell in a huge array of core cells, compare Fig. 3.3 [7]. Reading and writing are complex procedures, which will be described in section 2.1. Due to the periphery overhead, SRAM cells do not make sense as single latch cells. So one SRAM cell never comes alone; the typical SRAM array size is at least some thousands to some millions of cells, which makes it a kBit or MBit array. Therefore SRAM is also a great test vehicle for variability examinations.

So the key performance of SRAM compared to all other memory types is speed: SRAM has about 1 ns read and less than 1 ns write access time. It is typically used inside a microcontroller for cache memory, which is divided into several, normally up to 3, speed or hierarchy levels. Level 1 cache is clocked at the CPU frequency, which is some GHz. Therefore, a memory type with less than 1 ns access time is needed. This level nowadays normally has a size of 4 to 64 kB. Level 2 is much bigger, about 64 kB to 12 MB. Sometimes it is not on the CPU itself, and it is always clocked slower, e.g. at some hundreds of megahertz.

For the advantage of high speed, one SRAM cell needs about 10 to 15 times more area than a DRAM cell [1], which directly translates to cost. One SRAM cell in 65 nm is about $0.5 - 0.7\ \mu\mathrm{m}^2$; the core cells examined in this work have a size between $0.6 - 0.7\ \mu\mathrm{m}^2$. To keep this area as small as possible, they are built with especially tight design rules. This is possible because of their regular layout. These rules allow more minimum-size transistors to be placed than in conventional logic, so SRAMs do have a special status in semiconductor manufacturing.



Fig. 3.3: SRAM block diagram showing the core cell array and the periphery containing row/column decoder and sense amplifier.

## 4. PROPOSED SRAM CELLS

## A. Proposed 11T-1 Cell

Fig. 4.1 shows the schematic diagram of the proposed 11T-1 SRAM cell. The cell core consists of cross-coupled inverters with the addition of a Power cut-off with floating-avoidance assist (PCFA) network. The transistors MP1 and MP3 in the PCFA network internally cut off the supply voltage to weaken the pull-up path and provide contention-free discharge of the storage node, which improves the write-ability. Transistor MP2, driven by the row-based WL, avoids the floating-1 situation in CHS cells. The write access transistors MAL and MAR are controlled by the column-based WLA and WLB signals. Table I illustrates the status of the control signals in the different modes of operation of the proposed cells.

Fig. 4.1. Proposed 11T-1 cell schematics.

Fig. 4.2. Proposed 11T-2 cell schematics.

TABLE I

CONTROL SIGNALS DURING VARIOUS MODES OF OPERATION FOR THE PROPOSED 11T-1/11T-2 CELL

| Control Signal | Hold | Read | Write '0' | Write '1' |
|----------------|------|------|-----------|-----------|
| WLA            | 0/1  | 0/1  | 1         | 0         |
| WLB            | 0/1  | 0/1  | 0         | 1         |
| WL             | 0/1  | 1/0  | 1/0       | 1/0       |
| RBL            | 1    | Pre  | 0/1       | 1         |
| RWL            | 0    | 1    | 0         | 0         |
| VVSS           | 1    | 0    | 0/1       | 0/1       |

During the Write '0' operation, the WLA and WL signals are enabled, whereas WLB and VVSS are disabled. The left inverter is completely cut off from the power supply and node Q is easily discharged through transistors MAL and MR2. Similarly, for write '1', WL and WLB are enabled, whereas WLA is disabled. The supply is now cut off for the right inverter, node QB is discharged easily through MAR and MR2, and consequently '1' is written at node Q.

The read operation is accomplished by enabling the WL signal and keeping both WLA and WLB at '0'. The RBL is pre-charged prior to the read operation. The discharging path for RBL through transistors MR1 and MR2 is turned on depending on the data stored at QB. The disabled WLA and WLB signals completely isolate the data storage nodes (Q and QB) from any read-disturbing path during the read access. Therefore, 'read upset' is of no concern even for subthreshold operation.

In the Hold mode, all the control signals are disabled, which provides completely isolated cross-coupled inverters without any floating node. Therefore, the cell stability in the hold mode is the same as that of the 6T cell. The VVSS signal is kept high, which significantly reduces the static power consumption during standby mode.
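For scripting testbenches, Table I can be transcribed into a small lookup structure. In the sketch below, table entries written as "a/b" are read as "value for 11T-1 / value for 11T-2"; that reading is an assumption based on the description of the two cells, and the names `TABLE_I` and `control_signals` are hypothetical.

```python
# Table I transcribed as data. Entries written "a/b" in the table are read
# here as "11T-1 value / 11T-2 value"; that interpretation is an assumption.
TABLE_I = {
    "WLA":  {"hold": "0/1", "read": "0/1", "write0": "1",   "write1": "0"},
    "WLB":  {"hold": "0/1", "read": "0/1", "write0": "0",   "write1": "1"},
    "WL":   {"hold": "0/1", "read": "1/0", "write0": "1/0", "write1": "1/0"},
    "RBL":  {"hold": "1",   "read": "Pre", "write0": "0/1", "write1": "1"},
    "RWL":  {"hold": "0",   "read": "1",   "write0": "0",   "write1": "0"},
    "VVSS": {"hold": "1",   "read": "0",   "write0": "0/1", "write1": "0/1"},
}


def control_signals(cell, mode):
    """Return {signal: value} for cell '11T-1' or '11T-2' in the given mode.

    mode is one of 'hold', 'read', 'write0', 'write1'.
    """
    index = {"11T-1": 0, "11T-2": 1}[cell]
    result = {}
    for signal, modes in TABLE_I.items():
        entry = modes[mode]
        parts = entry.split("/")
        # Single-valued entries apply to both cells.
        result[signal] = parts[index] if len(parts) == 2 else entry
    return result
```

With this reading, `control_signals("11T-1", "hold")` yields all word-line signals at '0' and VVSS at '1', in line with the hold-mode description above.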

## B. Proposed 11T-2 Cell

Fig. 4.2 shows the schematic diagram of the proposed 11T-2 SRAM cell. It consists of a similar cell core with an additional Ground-cut-off with floating-avoidance assist (GCFA) network comprising MN1, MN2 and MN3. The transistors MN1 and MN3 in the GCFA network internally cut off the ground during the write operation and provide contention-free charging of the high-going node, which improves the write-ability. Transistor MN2, driven by the row-based WL, prevents the floating-'0' situation in CHS cells. The cell uses single-ended sensing with an additional read buffer comprising transistors MR1 and MR2. The VVSS signal is used to eliminate unnecessary leakage during standby mode. The write access transistors MPAL and MPAR are controlled by the column-based WLA and WLB signals. Transistor MPU is controlled by the row-based WL signal and is shared in a row.

During the Write '0' operation, WLA is enabled, whereas the WLB and WL signals are disabled. The right inverter is completely cut off from the ground path and node QB is easily pulled up through transistors MPAR and MPU without contention from the pull-down transistor MNR. Consequently, Q is discharged to ground through MNL and MN1. The write '1' operation follows a similar procedure because the write operation is symmetric.

## 5. EXPECTED RESULTS



Fig. 5.1: Simulation of SNM (read): if HCI degradation is applied to the '0' memory side, the SRAM gains 8% stability.



Fig. 5.2: Timing behavior of a write cycle: done in 0.5 ns.



Fig. 5.3. Leakage power as a function of stress time under condition of relaxed stress of 10/90. (Inset: Expanded y-axis to clearly see the data)



Fig. 5.4. (a) Read access time (b) Write access time as a function of stress time under condition of relaxed stress of 10/90.

## CONCLUSION

This work proposed two fully half-select-free robust 11T SRAM cell topologies that are suitable for a bit-interleaved architecture. The proposed 11T-1 and 11T-2 cells eliminate read disturb and write half-select disturb and improve the write-ability by using power cut-off and write '0'/'1' only techniques. The 11T-1 and 11T-2 cells have shown higher read and write yields compared with the 6T cell. Both proposed cells successfully eliminate the floating node condition encountered in earlier power cut-off cells during write half-select. Monte-Carlo simulation confirms low-voltage operation without any additional peripheral write- and read-assist circuits. The impact of BTI on the SRAM performance was also analyzed in a predictive 32 nm high-k metal gate CMOS technology. Under static stress, the RSNM is found to be reduced for all cells. However, the 11T-1 and 11T-2 cells were found to have improved RSNM under BTI. Moreover, the proposed cells improve WM, reduce write power and leakage power, and increase write delay without any effect on read delay/power over time.

## REFERENCES

[1] N. Maroof and B. S. Kong, "10T SRAM Using Half-VDD Precharge and Row-Wise Dynamically Powered Read Port for Low Switching Power and Ultralow RBL Leakage," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 4, pp. 1193-1203, April 2017, DOI: 10.1109/TVLSI.2016.2637918.

[2] C. C. Wang, D. S. Wang, C. H. Liao and S. Y. Chen, "A Leakage Compensation Design for Low Supply Voltage SRAM," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 5, pp. 1761-1769, May 2016, DOI: 10.1109/TVLSI.2015.2484386.

[3] J. P. Kulkarni and K. Roy, "Ultralow-voltage process-variation-tolerant Schmitt-trigger-based SRAM design," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 2, pp. 319–332, Feb. 2012, DOI: 10.1109/TVLSI.2010.2100834.

[4] H. Jiao, Y. Qiu and V. Kursun, "Variations-tolerant 9T SRAM circuit with robust and low leakage SLEEP mode," IEEE 22nd International Symposium on On-Line Testing and Robust System Design (IOLTS), 2016, pp. 39-42, DOI: 10.1109/IOLTS.2016.7604668.

[5] B. Wang, T. Q. Nguyen, A. T. Do, J. Zhou, M. Je and T. T. H. Kim, "Design of an Ultra-low Voltage 9T SRAM with Equalized Bitline Leakage and CAM-Assisted Energy Efficiency Improvement," IEEE Trans. on Circuits and Systems I: Regular Papers, vol. 62, no. 2, pp. 441-448, Feb. 2015, DOI: 10.1109/TCSI.2014.2360760.

[6] L. Chang, D. M. Fried, J. Hergenrother, J.W. Sleight, R.H. Dennard, R.K. Montoye, L. Sekaric, S.J. McNab, A.W. Topol, C.D. Adams, K.W. Guarini and W. Haensch et al., "Stable SRAM cell design for the 32 nm node and beyond," Symp. VLSI Technol. Dig. Tech. Pap., pp. 128–129, 2005, DOI: 10.1109/.2005.1469239.

[7] S. China Venkhateshwarlu, A. Karthik, Sandeepkumar, "Implementation of Area optimized Low power Multiplication and Accumulation," International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075, vol. 9, no. 1, pp. 2928-2932, November 2019, DOI: 10.35940/ijitee.A9110.119119.

[8] S. Lin, Y.-B. Kim, F. Lombardi, "A low leakage 9T SRAM cell for ultra-low power operation," Proc. 18th ACM Great Lakes Symp. VLSI, pp. 123-126, 2008, DOI: 10.1145/1366110.1366141.